RELATED APPLICATIONS

[0001] This application claims priority of U.S. Provisional Application No. 60/223,323, filed 
August 7, 2000, which is herein incorporated by reference in its entirety. 

BACKGROUND OF THE INVENTION 

[0002] Benign Prostatic Hyperplasia (BPH) is the most common benign tumor in men aged >60 
years. It is estimated that one in four men living to the age of 80 will require treatment for this 
disease. BPH is usually noted clinically after the age of 50, the incidence increasing with age, 
but as many as two thirds of men between the ages of 40 and 49 demonstrate histological 
evidence of the disease. 

[0003] The anatomic location of the prostate at the bladder neck enveloping the urethra plays 
an important role in the pathology of BPH, including bladder outlet obstruction. Two prostate 
components are thought to play a role in bladder outlet obstruction. The first is the relative 
increased prostate tissue mass. The second component is the prostatic smooth muscle tone.

[0004] The causative factors of BPH in man have been intensively studied. See Ziada et al.,
Urology, 53: 1-6, 1999. In general, the two most important factors appear to be aging and the 
presence of functional testes. Although these factors appear to be key to the development of 
BPH, both appear to be nonspecific. 

[0005] Little is known about the molecular changes in prostate cells associated with the 
development and progression of BPH. It has been demonstrated that the expression levels of a 
number of individual genes are changed compared to normal prostate cells. These changes in 
gene expression include decreased expression of Wilms' tumor gene (WT-1) and increased
expression of insulin-like growth factor II (IGF-II) (Dong et al., J. Clin. Endocrin. Metab., 82(7):
2198-220). 

[0006] While the changes in the expression levels of a number of individual genes have been 
identified, the investigation of the global changes in gene expression has not been reported.

[0007] Accordingly, there exists a need for the investigation of the changes in global gene
expression levels as well as the need for the identification of new molecular markers associated 
with the development and progression of BPH. Furthermore, if intervention is expected to be 
successful in halting or slowing down BPH, means of accurately assessing the early 
manifestations of BPH need to be established. One way to accurately assess the early 
manifestations of BPH is to identify markers which are uniquely associated with disease 
progression. Likewise, the development of therapeutics to prevent or stop the progression of 
BPH relies on the identification of genes responsible for BPH growth and function. 

SUMMARY OF THE INVENTION 

[0008] The present invention is based on the elucidation of the global changes in gene 
expression in BPH tissue isolated from patients exhibiting different clinical states of prostate 
hyperplasia as compared to normal prostate tissue as well as the identification of individual genes 
that are differentially expressed in BPH tissue. 

[0009] The invention is also based on the discovery of a means of effectively selecting disease- 
linked drug targets from gene expression results. The invention includes methods of classifying 
genes whose expression levels are changed in diseased tissues, during disease induction or during 
disease progression into specific groups. By using this method it is possible to classify genes 
whose expression is regulated by the same mechanism into the same group, and it is possible to
identify representative marker genes by selecting typical genes from each cluster.

[0010] The invention includes methods of screening for or identifying an agent that modulates
the onset or progression of BPH, comprising: preparing a first gene expression profile of BPH 
cells; exposing the cells to the agent; preparing a second gene expression profile of the agent 
exposed cells; and comparing the first and second gene expression profiles. In a preferred 
embodiment of these methods, the gene expression profile comprises the expression levels of one 
or more, or preferably two or more, genes in Tables 1-5. In another preferred embodiment of
these methods, the cell is a prostate cell from a BPH patient, a cell line in Table 6, or a derivative 
thereof. 

[0011] The invention also includes methods of monitoring a treatment of a patient with BPH, 
comprising administering a pharmaceutical composition to the patient; preparing a gene 
expression profile from a prostate cell or tissue sample from the patient; and comparing the 
patient gene expression profile to a gene expression profile from a normal prostate cell 
population, a BPH tissue or BPH cells without treatment with the pharmaceutical composition. In
preferred embodiments of these methods, the gene expression profile comprises the expression
levels of one or more or, preferably, two or more genes in Tables 1-5.

[0012] The invention also includes methods of diagnosing benign prostatic hyperplasia (BPH) 
in a subject comprising the step of detecting the level of expression in a tissue or cell sample 
from the subject of two or more genes from Tables 1-5 (preferably Tables 3-5, and more 
preferably Table 5); wherein differential expression of the genes is indicative of BPH 
progression. 

[0013] The invention further includes methods of detecting the onset or progression of benign 
prostatic hyperplasia (BPH) in a patient comprising the step of detecting the level of expression 
in a tissue or cell sample of two or more genes from Tables 1-5 (preferably Tables 3-5, and more 
preferably Table 5); wherein differential expression of the genes is indicative of BPH 
progression. 

[0014] The invention also includes methods of differentiating benign prostatic hyperplasia 
(BPH) from prostate cancer in a patient comprising the step of detecting the level of expression 
in a tissue or cell sample of two or more genes from Tables 1-5 (preferably Tables 3-5, and more 
preferably Table 5); wherein differential expression of the genes is indicative of BPH rather than 
prostate cancer. 

[0015] The invention also includes methods of selecting or identifying cells that can be used for 
drug screening. 

[0016] All of these methods may include the step of detecting the expression levels of at least 
about 2, 3, 4, 5, 6, 7, 8, 9, 10 or more genes in any of Tables 1-5, or preferably Table 5. In a 
preferred embodiment, expression of all of the genes or nearly all of the genes in Tables 1-5, or 
preferably Table 5, may be detected. 

[0017] The invention further includes sets of at least two or more probes, wherein each of the 
probes comprises a sequence that specifically hybridizes to a gene in Tables 1-5 as well as solid
supports comprising at least two or more of the probes. 

[0018] The invention also includes computer systems comprising or linked to a database 
containing information identifying the expression level in BPH tissue or cells of a set of genes 
comprising at least two genes in Tables 1-5, preferably from Table 5; and a user interface to view 
the information. The database may further comprise sequence information for the genes as well 
as information identifying the expression level for the set of genes in normal prostate tissue or 
cells, and prostate cancer tissue. The database may further contain or be linked to descriptive
information from an external database, which information correlates said genes to records in the
external database. 

[0019] The invention further includes methods of using the disclosed computer systems to 
present information identifying the expression level in a tissue or cell of a set of genes 
comprising at least one of the genes in Tables 1-5, preferably Table 5, comprising comparing the 
expression level of at least one gene in Tables 1-5, preferably Table 5, in the tissue or cell to the 
level of expression of the gene in the database. 

[0020] Lastly, the invention includes kits comprising probes or solid supports of the invention. 
In some embodiments, the kits also contain written materials or software concerning gene 
expression information for the genes of the invention, preferably in electronic format. 

BRIEF DESCRIPTION OF THE DRAWINGS 

[0021] Figure 1. Figure 1 shows the expression of cellular retinol binding protein RNA in 
various tissues. 

[0022] Figure 2. Figure 2 shows the expression of cellular retinol binding protein RNA in 
various prostate tissue samples. In all of the figures, "Normal", "-Sym", "Cancer" and "+Sym"
refer to normal prostate, BPH without symptoms, prostate cancer, and BPH with symptoms, 
respectively. 

[0023] Figure 3. Figure 3 shows the expression of S100 calcium binding protein RNA in
various tissues.

[0024] Figure 4. Figure 4 shows the expression of S100 calcium binding protein RNA in
various prostate tissue samples. 

[0025] Figure 5. Figure 5 shows the expression of PSMA RNA in various tissues.

[0026] Figure 6. Figure 6 shows the expression of PSMA RNA in various prostate tissue
samples. 

DETAILED DESCRIPTION 

[0027] Many biological functions are accomplished by altering the expression of various genes 
through transcriptional (e.g. through control of initiation, provision of RNA precursors, RNA 
processing, etc.) and/or translational control. For example, fundamental biological processes 
such as cell cycle, cell differentiation and cell death, are often characterized by the variations in 
the expression levels of groups of genes.

[0028] Changes in gene expression also are associated with pathogenesis. For example, the
lack of sufficient expression of functional tumor suppressor genes and/or the overexpression of
oncogenes/proto-oncogenes could lead to tumorigenesis or hyperplastic growth of cells (Marshall,
Cell, 64: 313-326 (1991); Weinberg, Science, 254:1138-1146 (1991)). Thus, changes in the 
expression levels of particular genes (e.g. oncogenes or tumor suppressors) serve as signposts for 
the presence and progression of various diseases. 

[0029] Monitoring changes in gene expression may also provide certain advantages during drug 
screening development. Often drugs are screened for the ability to interact with a major target 
without regard to other effects the drugs have on cells. Often such other effects cause toxicity in 
the whole animal, which prevents the development and use of the potential drug.

[0030] The present inventors have examined tissue from normal prostate, BPH and BPH
prostate tissue immediately adjacent to malignant prostate tissue to identify the global changes in 
gene expression in BPH. These global changes in gene expression, also referred to as expression 
profiles, provide useful markers for diagnostic uses as well as markers that can be used to 
monitor disease states, disease progression, toxicity, drug efficacy and drug metabolism. 

Assay Formats 

[0031] The genes identified as being differentially expressed in BPH tissue or BPH cells 
(Tables 1-5) may be used in a variety of nucleic acid detection assays to detect or quantitate the
expression level of a gene or multiple genes in a given sample. For example, traditional Northern
blotting, nuclease protection, RT-PCR and differential display methods may be used for
detecting gene expression levels. Those methods are useful for some embodiments of the 
invention. However, methods and assays of the invention are most efficiently designed with 
hybridization-based methods for detecting the expression of a large number of genes.

[0032] Any hybridization assay format may be used, including solution-based and solid
support-based assay formats. Solid supports containing oligonucleotide probes for differentially 
expressed genes of the invention can be filters, polyvinyl chloride dishes, silicon or glass based 
beads or chips, etc. Such supports and hybridization methods are widely available, for example, 
those disclosed by Beattie (WO 95/11755). Any solid surface to which oligonucleotides can be
bound, either directly or indirectly, either covalently or non-covalently, can be used.

[0033] A preferred solid support is a high density array or DNA chip. These contain a particular
oligonucleotide probe in a predetermined location on the array. Each predetermined location
may contain more than one molecule of the probe, but each molecule within the predetermined
location has an identical sequence. Such predetermined locations are termed features. There
may be, for example, from 2, 10, 100, 1000 to 10,000, 100,000 or 400,000 of such features on a
single solid support. The solid support, or the area within which the probes are attached may be 
on the order of about a square centimeter. 

[0034] Oligonucleotide probe arrays for expression monitoring can be made and used according 
to any technique known in the art (see for example, Lockhart et al., Nat. Biotechnol. (1996) 14,
1675-1680; McGall et al., Proc. Nat. Acad. Sci. USA (1996) 93, 13555-13560). Such probe
arrays may contain at least two or more oligonucleotides that are complementary to or hybridize
to two or more of the genes described in Tables 1-5. For instance, such arrays may contain
oligonucleotides that are complementary or hybridize to at least about 2, 3, 4, 5, 6, 7, 8, 9, 10, 20,
30, 50, 70 or more of the genes described herein.

[0035] The genes which are assayed according to the present invention are typically in the form 
of mRNA or reverse transcribed mRNA. The genes may be cloned or not. The genes may be 
amplified or not. The cloning itself does not appear to bias the representation of genes within a 
population. However, it may be preferable to use polyA+ RNA as a source, as it can be used 
with fewer processing steps.

[0036] The sequences and related information of the genes described herein are available in the 
public databases. Tables 1-5 provide the Accession numbers and name for each of the 
sequences. The sequences and related information of the genes listed in the Tables according to 
their GenBank identifiers are expressly incorporated herein as of the filing date of this 
application (see: www.ncbi.nlm.nih.gov/).

[0037] Probes based on the sequences of the genes described above may be prepared by any 
commonly available method. Oligonucleotide probes for interrogating the tissue or cell sample 
are preferably of sufficient length to specifically hybridize only to appropriate, complementary 
genes or transcripts. Typically the oligonucleotide probes will be at least 10, 12, 14, 16, 18, 20 or 
25 nucleotides in length. In some cases longer probes of at least 30, 40, or 50 nucleotides will be 
desirable. 

[0038] As used herein, oligonucleotide sequences that are complementary to one or more of the 
genes described in Tables 1-5 refer to oligonucleotides that are capable of hybridizing under 
stringent conditions to at least part of the nucleotide sequence of said genes. Such hybridizable 
oligonucleotides will typically exhibit at least about 75% sequence identity at the nucleotide level
to said genes, preferably about 80% or 85% sequence identity or more preferably about 90% or
95% or more sequence identity to said genes. 

[0039] "Bind(s) substantially" refers to complementary hybridization between a probe nucleic 
acid and a target nucleic acid and embraces minor mismatches that can be accommodated by 
reducing the stringency of the hybridization media to achieve the desired detection of the target 
polynucleotide sequence. 

[0040] The terms "background" or "background signal intensity" refer to hybridization signals 
resulting from non-specific binding, or other interactions, between the labeled target nucleic 
acids and components of the oligonucleotide array (e.g., the oligonucleotide probes, control 
probes, the array substrate, etc.). Background signals may also be produced by intrinsic 
fluorescence of the array components themselves. A single background signal can be calculated 
for the entire array, or a different background signal may be calculated for each target nucleic 
acid. In a preferred embodiment, background is calculated as the average hybridization signal 
intensity for the lowest 5% to 10% of the probes in the array, or, where a different background 
signal is calculated for each target gene, for the lowest 5% to 10% of the probes for each gene. 
Of course, one of skill in the art will appreciate that where the probes to a particular gene 
hybridize well and thus appear to be specifically binding to a target sequence, they should not be 
used in a background signal calculation. Alternatively, background may be calculated as the 
average hybridization signal intensity produced by hybridization to probes that are not 
complementary to any sequence found in the sample (e.g. probes directed to nucleic acids of the 
opposite sense or to genes not found in the sample such as bacterial genes where the sample is 
mammalian nucleic acids). Background can also be calculated as the average signal intensity 
produced by regions of the array that lack probes. 
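
By way of illustration only, the first background calculation described above (the average hybridization signal intensity of the lowest 5% to 10% of the probes) might be sketched in Python as follows. The function name `estimate_background`, the `fraction` parameter and the example intensities are hypothetical and are not part of the disclosure; the sketch also assumes that probes known to bind their targets specifically have already been excluded from the input.

```python
def estimate_background(intensities, fraction=0.05):
    """Average signal of the dimmest `fraction` of probes (hypothetical helper)."""
    if not intensities:
        raise ValueError("at least one probe intensity is required")
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    ordered = sorted(intensities)
    # Always keep at least one probe so that the average is defined.
    n_lowest = max(1, int(len(ordered) * fraction))
    lowest = ordered[:n_lowest]
    return sum(lowest) / len(lowest)


# Made-up intensities for 20 probes (arbitrary units), for illustration only.
example = [12, 15, 11, 300, 450, 20, 18, 900, 14, 16,
           1200, 13, 17, 19, 650, 22, 25, 30, 800, 10]
background = estimate_background(example, fraction=0.10)
```

The same helper could be applied per gene, by passing only the probes directed to that gene, where a separate background signal is desired for each target.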

[0041] The phrase "hybridizing specifically to" refers to the binding, duplexing, or hybridizing 
of a molecule substantially to or only to a particular nucleotide sequence or sequences under 
stringent conditions when that sequence is present in a complex mixture (e.g., total cellular DNA 
or RNA). 

[0042] Assays and methods of the invention may utilize available formats to simultaneously 
screen at least about 100, preferably about 1000, more preferably about 10,000 and most 
preferably about 1,000,000 different nucleic acid hybridizations. 

[0043] As used herein a "probe" is defined as a nucleic acid molecule, capable of binding to a 
target nucleic acid of complementary sequence through one or more types of chemical bonds, 
usually through complementary base pairing, usually through hydrogen bond formation. As used
herein, a probe may include natural (i.e., A, G, U, C, or T) or modified bases (7-deazaguanosine,
inosine, etc.). In addition, the bases in probes may be joined by a linkage other than a 
phosphodiester bond, so long as it does not interfere with hybridization. Thus, probes may be 
peptide nucleic acids in which the constituent bases are joined by peptide bonds rather than 
phosphodiester linkages. 

[0044] The term "stringent conditions" refers to conditions under which a probe will hybridize 
to its target subsequence, but with only insubstantial hybridization to other sequences, or with
hybridization to other sequences such that the difference may be identified. Stringent conditions
are sequence-dependent and will be different in different circumstances. Longer sequences hybridize
specifically at higher temperatures. Generally, stringent conditions are selected to be about 5°C
lower than the thermal melting point (Tm) for the specific sequence at a defined ionic strength 
and pH. 

[0045] Typically, stringent conditions will be those in which the salt concentration is at least 
about 0.01 to 1.0 M Na ion concentration (or other salts) at pH 7.0 to 8.3 and the temperature is 
at least about 30°C for short probes (e.g., 10 to 50 nucleotides). Stringent conditions may also be
achieved with the addition of destabilizing agents such as formamide.

[0046] The "percentage of sequence identity" or "sequence identity" is determined by
comparing two optimally aligned sequences or subsequences over a comparison window or span, 
wherein the portion of the polynucleotide sequence in the comparison window may optionally 
comprise additions or deletions (i.e., gaps) as compared to the reference sequence (which does 
not comprise additions or deletions) for optimal alignment of the two sequences. The percentage 
is calculated by determining the number of positions at which the identical subunit (e.g. nucleic
acid base or amino acid residue) occurs in both sequences to yield the number of matched 
positions, dividing the number of matched positions by the total number of positions in the 
window of comparison and multiplying the result by 100 to yield the percentage of sequence 
identity. Percentage sequence identity when calculated using the programs GAP or BESTFIT 
(see below) is calculated using default gap weights. 
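
The arithmetic described above can be illustrated with a minimal Python sketch. It assumes that the two sequences have already been optimally aligned (for example by a program such as GAP or BESTFIT) and are supplied as equal-length strings in which gaps are written as "-". The function name `percent_identity` and the example sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percentage of window positions holding the identical subunit in both sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("comparison window is empty")
    # A position counts as matched only if both sequences carry the same
    # subunit there; a gap in either sequence is never a match.
    matches = sum(
        1
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if a == b and a != gap
    )
    # Matched positions divided by total positions in the window, times 100.
    return 100.0 * matches / len(aligned_a)


# Made-up example: one substitution in a 10-position window.
identity = percent_identity("ACGTACGTAC", "ACGTTCGTAC")
```

Under these assumptions, a probe meeting the "at least about 75% sequence identity" level would be one for which the returned value is 75.0 or greater over the chosen comparison window.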

Probe design 

[0047] One of skill in the art will appreciate that an enormous number of array designs are 
suitable for the practice of this invention. The high density array will typically include a number 
of probes that specifically hybridize to the sequences of interest. See WO 99/32660 for methods
of producing probes for a given gene or genes. In addition, in a preferred embodiment, the array
will include one or more control probes. 

[0048] High density array chips of the invention include "test probes." Test probes could be 
oligonucleotides that range from about 5 to about 500 or 5 to about 45 nucleotides, more 
preferably from about 10 to about 40 nucleotides and most preferably from about 15 to about 40 
nucleotides in length. In other particularly preferred embodiments the probes are 20 or 25 
nucleotides in length. In another preferred embodiment, test probes are double or single strand 
DNA sequences. DNA sequences are isolated or cloned from natural sources or amplified from 
natural sources using native nucleic acid as templates. These probes have sequences 
complementary to particular subsequences of the genes whose expression they are designed to 
detect. Thus, the test probes are capable of specifically hybridizing to the target nucleic acid they 
are to detect (the genes of Tables 1-5). 

[0049] The term "perfect match probe" refers to a probe that has a sequence that is perfectly 
complementary to a particular target sequence. The probe is typically perfectly complementary to 
a portion (subsequence) of the target sequence. The perfect match (PM) probe can be a "test 
probe", a "normalization control" probe, an expression level control probe and the like. A 
perfect match control or perfect match probe is, however, distinguished from a "mismatch 
control" or "mismatch probe." 

[0050] In addition to test probes that bind the target nucleic acid(s) of interest, the high density 
array can contain a number of control probes. The control probes fall into three categories 
referred to herein as 1) normalization controls; 2) expression level controls; and 3) mismatch 
controls. 

[0051] Normalization controls are oligonucleotide or other nucleic acid probes that are 
complementary to labeled reference oligonucleotides or other nucleic acid sequences that are 
added to the nucleic acid sample to be screened. The signals obtained from the normalization 
controls after hybridization provide a control for variations in hybridization conditions, label 
intensity, "reading" efficiency and other factors that may cause the signal of a perfect 
hybridization to vary between arrays. In a preferred embodiment, signals (e.g., fluorescence 
intensity) read from all other probes in the array are divided by the signal (e.g., fluorescence
intensity) from the control probes thereby normalizing the measurements. 
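
As a non-limiting illustration, the normalization described above might be sketched in Python as follows. The function name `normalize_signals` and the example values are hypothetical, and the use of the mean of several control probes as the divisor is an assumption made for this sketch; the text above specifies only that signals are divided by the signal from the control probes.

```python
def normalize_signals(probe_signals, control_signals):
    """Divide each probe signal by the mean normalization-control signal."""
    if not control_signals:
        raise ValueError("at least one normalization control is required")
    # Assumption for this sketch: several controls are summarized by their mean.
    control_mean = sum(control_signals) / len(control_signals)
    if control_mean <= 0:
        raise ValueError("control signal must be positive")
    return {name: value / control_mean for name, value in probe_signals.items()}


# Two made-up arrays read at different overall brightness.
array_a = normalize_signals({"gene_1": 200.0, "gene_2": 50.0}, [100.0, 100.0])
array_b = normalize_signals({"gene_1": 400.0, "gene_2": 100.0}, [200.0, 200.0])
```

In this made-up example the second array is read twice as brightly as the first, yet both yield the same normalized values, which is the between-array variation the normalization controls are intended to remove.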

[0052] Virtually any probe may serve as a normalization control. However, it is recognized that
hybridization efficiency varies with base composition and probe length. Preferred normalization
probes are selected to reflect the average length of the other probes present in the array; however,
they can be selected to cover a range of lengths. The normalization control(s) can also be selected
to reflect the (average) base composition of the other probes in the array; however, in a preferred
embodiment, only one or a few probes are used and they are selected such that they hybridize
well (i.e., no secondary structure) and do not match any target-specific probes.

[0053] Expression level controls are probes that hybridize specifically with constitutively
expressed genes in the biological sample. Virtually any constitutively expressed gene provides a 
suitable target for expression level controls. Typically expression level control probes have 
sequences complementary to subsequences of constitutively expressed "housekeeping genes" 
including, but not limited to an actin gene, the transferrin receptor gene, the GAPDH gene, and 
the like. 

[0054] Mismatch controls or mismatch probes may also be provided for the probes to the target 
genes, for expression level controls or for normalization controls. Mismatch controls are 
oligonucleotide probes or other nucleic acid probes identical to their corresponding test or control 
probes except for the presence of one or more mismatched bases. A mismatched base is a base 
selected so that it is not complementary to the corresponding base in the target sequence to which 
the probe would otherwise specifically hybridize. One or more mismatches are selected such that 
under appropriate hybridization conditions (e.g., stringent conditions) the test or control probe 
would be expected to hybridize with its target sequence, but the mismatch probe would not 
hybridize (or would hybridize to a significantly lesser extent). Preferred mismatch probes 
contain a central mismatch. Thus, for example, where a probe is a 20 mer, a corresponding 
mismatch probe will have the identical sequence except for a single base mismatch (e.g., 
substituting a G, a C or a T for an A) at any of positions 6 through 14 (the central mismatch).

[0055] Mismatch probes thus provide a control for non-specific binding or cross hybridization
to a nucleic acid in the sample other than the target to which the probe is directed. Mismatch 
probes also indicate whether a hybridization is specific or not. For example, if the target is 
present the perfect match probes should be consistently brighter than the mismatch probes. In 
addition, if all central mismatches are present, the mismatch probes can be used to detect a 
mutation. The difference in intensity between the perfect match and the mismatch probe 
provides a good measure of the concentration of the hybridized material. 
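
For illustration only, the construction of a central mismatch probe and the perfect match/mismatch intensity difference might be sketched in Python as follows. The function names, the default choice of the middle position, and the choice of substituting the complementary base are assumptions made for this sketch; the description above permits any non-complementary substitution at any of the central positions.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def make_mismatch_probe(perfect_match, position=None):
    """Copy a perfect match probe and substitute one base (1-based position)."""
    probe = perfect_match.upper()
    if not probe or set(probe) - set(COMPLEMENT):
        raise ValueError("probe must be a non-empty string of A, C, G and T")
    if position is None:
        # Assumed default: the middle base, e.g. position 11 of a 20-mer,
        # which lies within the central positions 6 through 14.
        position = len(probe) // 2 + 1
    if not 1 <= position <= len(probe):
        raise ValueError("position is outside the probe")
    index = position - 1
    # Assumed substitution: swap the base for its complement.
    return probe[:index] + COMPLEMENT[probe[index]] + probe[index + 1:]


def pm_mm_difference(pm_signal, mm_signal):
    """Difference in intensity between perfect match and mismatch probes."""
    return pm_signal - mm_signal


pm_probe = "ACGTACGTACGTACGTACGT"          # made-up 20-mer
mm_probe = make_mismatch_probe(pm_probe)   # differs only at position 11
difference = pm_mm_difference(850.0, 120.0)  # made-up intensities
```

A consistently positive difference across the probe pairs for a gene would be in keeping with specific hybridization to the target, as described above.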

Nucleic Acid Samples

[0056] As is apparent to one of ordinary skill in the art, nucleic acid samples used in the
methods and assays of the invention may be prepared by any available method or process. 
Methods of isolating total mRNA are well known to those of skill in the art. For example, 
methods of isolation and purification of nucleic acids are described in detail in Chapter 3 of 
Laboratory Techniques in Biochemistry and Molecular Biology: Hybridization With Nucleic Acid 
Probes, Part I Theory and Nucleic Acid Preparation, P. Tijssen, Ed., Elsevier, N.Y. (1993). Such 
samples include RNA samples, but also include cDNA synthesized from a mRNA sample 
isolated from a cell or tissue of interest. Such samples also include DNA amplified from the 
cDNA, and RNA transcribed from the amplified DNA. One of skill in the art would appreciate 
that it is desirable to inhibit or destroy RNase present in homogenates before homogenates can be 
used. 

[0057] Biological samples may be of any biological tissue or fluid or cells from any organism 
as well as cells raised in vitro, such as cell lines and tissue culture cells. Biological samples may 
also include sections of tissues, such as frozen sections or formalin fixed sections taken for 
histological purposes. Frequently, the sample will be a "clinical sample" which is a sample 
derived from a patient. Typical clinical samples include, but are not limited to prostate tissue, 
urine, sputum, blood, blood cells (e.g., white cells or peripheral blood leukocytes (PBL)), tissue or
fine needle biopsy samples, peritoneal fluid, and pleural fluid, or cells therefrom. 

Forming High Density Arrays. 

[0058] Methods of forming high density arrays of oligonucleotides with a minimal number of 
synthetic steps are known. The oligonucleotide analogue array can be synthesized on a solid 
substrate by a variety of methods, including, but not limited to, light-directed chemical coupling, 
and mechanically directed coupling. See Pirrung et al., U.S. Patent No. 5,143,854.

[0059] In brief, the light-directed combinatorial synthesis of oligonucleotide arrays on a glass
surface proceeds using automated phosphoramidite chemistry and chip masking techniques. In 
one specific implementation, a glass surface is derivatized with a silane reagent containing a 
functional group, e.g., a hydroxyl or amine group blocked by a photolabile protecting group. 
Photolysis through a photolithographic mask is used selectively to expose functional groups
which are then ready to react with incoming 5' photoprotected nucleoside phosphoramidites. The 
phosphoramidites react only with those sites which are illuminated (and thus exposed by removal 
of the photolabile blocking group). Thus, the phosphoramidites only add to those areas
selectively exposed from the preceding step. These steps are repeated until the desired array of
sequences has been synthesized on the solid surface. Combinatorial synthesis of different
oligonucleotide analogues at different locations on the array is determined by the pattern of
illumination during synthesis and the order of addition of coupling reagents.

Atty Docket No. 44921 -5029-US
Doc. No. 1622848.1

[0060] In addition to the foregoing, additional methods which can be used to generate an array
of oligonucleotides on a single substrate are described in WO 93/09668. High density nucleic acid
arrays can also be fabricated by depositing premade or natural nucleic acids in predetermined 
positions. Synthesized or natural nucleic acids are deposited on specific locations of a substrate 
by light directed targeting and oligonucleotide directed targeting. Another embodiment uses a 
dispenser that moves from region to region to deposit nucleic acids in specific spots. 

Hybridization 

[0061] Nucleic acid hybridization simply involves contacting a probe and target nucleic acid 
under conditions where the probe and its complementary target can form stable hybrid duplexes 
through complementary base pairing. See WO 99/32660. The nucleic acids that do not form 
hybrid duplexes are then washed away leaving the hybridized nucleic acids to be detected, 
typically through detection of an attached detectable label. It is generally recognized that nucleic 
acids are denatured by increasing the temperature or decreasing the salt concentration of the 
buffer containing the nucleic acids. Under low stringency conditions (e.g., low temperature
and/or high salt) hybrid duplexes (e.g., DNA:DNA, RNA:RNA, or RNA:DNA) will form even
where the annealed sequences are not perfectly complementary. 

[0062] Thus specificity of hybridization is reduced at lower stringency. Conversely, at higher 
stringency (e.g., higher temperature or lower salt) successful hybridization tolerates fewer
mismatches. One of skill in the art will appreciate that hybridization conditions may be selected
to provide any degree of stringency. In a preferred embodiment, hybridization is performed at
low stringency, in this case in 6X SSPE-T at 37°C (0.005% Triton X-100), to ensure hybridization,
and then subsequent washes are performed at higher stringency (e.g., 1X SSPE-T at 37°C) to
eliminate mismatched hybrid duplexes. Successive washes may be performed at increasingly
higher stringency (e.g., down to as low as 0.25X SSPE-T at 37°C to 50°C) until a desired level of
hybridization specificity is obtained. Stringency can also be increased by addition of agents such
as formamide. Hybridization specificity may be evaluated by comparison of hybridization to the
test probes with hybridization to the various controls that can be present (e.g., expression level
control, normalization control, mismatch controls, etc.). 

[0063] In general, there is a tradeoff between hybridization specificity (stringency) and signal 
intensity. Thus, in a preferred embodiment, the wash is performed at the highest stringency that 
produces consistent results and that provides a signal intensity greater than approximately 10% of 
the background intensity. Thus, in a preferred embodiment, the hybridized array may be washed 
at successively higher stringency solutions and read between each wash. Analysis of the data 
sets thus produced will reveal a wash stringency above which the hybridization pattern is not 
appreciably altered and which provides adequate signal for the particular oligonucleotide probes 
of interest. 
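
By way of illustration only, the selection of a wash stringency described in paragraph [0063] might be sketched as follows. The function names, the data layout (one list of probe intensities per wash, ordered from lowest to highest stringency), the use of a Pearson correlation to judge whether the pattern is "appreciably altered", and the 0.95 cutoff are all assumptions made for this sketch and are not taken from the specification.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length intensity lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def pick_wash(reads, min_signal, min_r=0.95):
    """Return the index of the first wash whose pattern is stable.

    reads: list of per-wash intensity lists, lowest stringency first.
    A wash is accepted when the next, more stringent wash leaves the
    pattern essentially unchanged (correlation >= min_r, an assumed
    cutoff) and the mean signal is at least min_signal.
    Returns None when no wash qualifies.
    """
    for i in range(len(reads) - 1):
        stable = pearson(reads[i], reads[i + 1]) >= min_r
        adequate = sum(reads[i]) / len(reads[i]) >= min_signal
        if stable and adequate:
            return i
    return None


# Hypothetical reads of four probes after four successive washes.
reads = [
    [100, 90, 80, 70],  # low stringency: mismatched duplexes still bound
    [90, 20, 75, 15],   # mismatches removed
    [85, 18, 70, 12],   # pattern essentially unchanged
    [40, 8, 33, 6],
]
chosen = pick_wash(reads, min_signal=30)
```

In this hypothetical data set the pattern changes between the first and second reads and is stable thereafter, so the second wash is selected.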



Signal Detection 

[0064] The hybridized nucleic acids are typically detected by detecting one or more labels 
attached to the sample nucleic acids. The labels may be incorporated by any of a number of 
means well known to those of skill in the art. See WO 99/32660. 

Databases 

[0065] The present invention includes relational databases containing sequence information, for 
instance for the genes of Tables 1-5, as well as gene expression information in various prostate 
tissue samples. Databases may also contain information associated with a given sequence or 
tissue sample such as descriptive information about the gene associated with the sequence 
information, metabolic pathway information for the gene or descriptive information concerning 
the clinical status of the tissue sample, or the patient from which the sample was derived. Such 
information for the patient may include, but is not limited to sex, age, disease status, general 
health information, surgical or treatment status, PSA levels, as well as information concerning 
the patient's clinical symptoms. The database may be designed to include different parts, for 
instance a sequence database and a gene expression database. Methods for the configuration and 
construction of such databases are widely available, for instance, see U.S. Patent 5,953,727, 
which is herein incorporated by reference in its entirety. 

[0066] The databases of the invention may be linked to an outside or external database. In a 
preferred embodiment, as described in Tables 1-5, the external database is GenBank and the 
associated databases maintained by the National Center for Biotechnology Information (NCBI). 

[0067] Any appropriate computer platform may be used to perform the necessary comparisons 
between sequence information, gene expression information and any other information in the 
database or provided as an input. For example, a large number of computer workstations are 
available from a variety of manufacturers, such as those available from Silicon Graphics. 
Client/server environments, database servers and networks are also widely available and 
appropriate platforms for the databases of the invention. 

[0068] The databases of the invention may be used to produce, among other things, electronic 
Northerns that allow the user to determine the cell type or tissue in which a given gene is 
expressed and to allow determination of the abundance or expression level of a given gene in a 
particular tissue or cell. 
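
As a non-limiting illustration, a relational layout of the kind described above, together with a query in the manner of an electronic Northern, might be sketched as follows. The table names, column names, the accession "ACC0001" and all expression values are hypothetical placeholders chosen for this sketch; the specification does not prescribe a particular schema.

```python
import sqlite3

# Hypothetical schema: sequence information, sample (clinical) information
# and expression levels are held in separate, linked tables.
SCHEMA = """
CREATE TABLE gene (
    gene_id INTEGER PRIMARY KEY,
    genbank_acc TEXT,
    description TEXT
);
CREATE TABLE sample (
    sample_id INTEGER PRIMARY KEY,
    tissue TEXT,
    clinical_status TEXT
);
CREATE TABLE expression (
    gene_id INTEGER REFERENCES gene(gene_id),
    sample_id INTEGER REFERENCES sample(sample_id),
    level REAL
);
"""


def electronic_northern(conn, genbank_acc):
    """Return the mean expression level of one gene in each tissue."""
    rows = conn.execute(
        """SELECT s.tissue, AVG(e.level)
           FROM expression e
           JOIN gene g ON g.gene_id = e.gene_id
           JOIN sample s ON s.sample_id = e.sample_id
           WHERE g.genbank_acc = ?
           GROUP BY s.tissue
           ORDER BY s.tissue""",
        (genbank_acc,),
    ).fetchall()
    return dict(rows)


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.execute("INSERT INTO gene VALUES (1, 'ACC0001', 'placeholder gene')")
conn.executemany(
    "INSERT INTO sample VALUES (?, ?, ?)",
    [(1, "normal prostate", "normal"),
     (2, "BPH", "symptomatic"),
     (3, "BPH", "symptomatic")],
)
conn.executemany(
    "INSERT INTO expression VALUES (?, ?, ?)",
    [(1, 1, 100.0), (1, 2, 300.0), (1, 3, 500.0)],
)
levels = electronic_northern(conn, "ACC0001")
```

Keeping sequence data and expression data in separate tables mirrors the division into a sequence database and a gene expression database mentioned in paragraph [0065].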

[0069] The databases of the invention may also be used to present information identifying the 
expression level in a tissue or cell of a set of genes comprising at least two of the genes in Tables 
1-5, comprising the step of comparing the expression level of at least one gene in Tables 1-5 
found or detected in the tissue to the level of expression of the gene in the database. Such 
methods may be used to predict the hyperplastic state of a given tissue by comparing the level of 
expression of a gene or genes in Tables 1-5 from a sample to the expression levels found in 
normal prostate cells, BPH cells or tissue and/or malignant or cancerous prostate tissue. Such 
methods may also be used in the drug or agent screening assays as described below. 

Selection of BPH-Associated Genes 

[0070] BPH associated genes may be identified or selected by any available method, including 
subtractive hybridization protocols, differential display protocols and high-throughput 
hybridization formats, including oligonucleotide and cDNA microarray technologies. 

[0071] Unprocessed or raw expression levels may be normalized, standardized and/or analyzed 
by any available computational method, including the expression level normalization, analysis 
and clustering methods herein described. The normalization method as described in Example 4 
may be combined with any further analysis method, including any clustering methods available 
in the art. 

Diagnostic Uses for the BPH Markers 

[0072] As described above, the genes and gene expression information provided in Tables 1-5 
may be used as diagnostic markers for the prediction or identification of the hyperplastic state of 
a prostate or other tissue. For instance, a prostate tissue or other patient sample may be assayed 
by any of the methods described above, and the expression levels from a gene or genes from 
Tables 1-5 may be compared to the expression levels found in normal prostate tissue, BPH tissue 
or BPH tissue from a patient with metastatic or nonmetastatic prostate cancer. In some instances, 
patient PBLs may be used as the patient sample. The comparison of expression data, as well as 
available sequence or other information may be done by a researcher or diagnostician or may be 
done with the aid of a computer and databases as described above. 

Use of the BPH Markers for Monitoring Disease Progression 

[0073] As described above, the genes and gene expression information provided in Tables 1-5 
may also be used as markers for the monitoring of disease progression, such as the development 
of BPH. For instance, a prostate tissue or other patient sample may be assayed by any of the 
methods described above, and the expression levels from a gene or genes from Tables 1-5 may 
be compared to the expression levels found in normal prostate tissue, BPH tissue or BPH tissue 
from a patient with metastatic or nonmetastatic prostate cancer. The comparison of the 
expression data, as well as available sequence or other information may be done by a researcher or 
diagnostician or may be done with the aid of a computer and databases as described above. 

[0074] The BPH markers of the invention may also be used to track or predict the progress or 
efficacy of a treatment regime in a patient. For instance, a patient's progress or response to a 
given drug may be monitored by creating a gene expression profile from a tissue or cell sample 
after treatment or administration of the drug. The gene expression profile may then be compared 
to a gene expression profile prepared from normal cells or tissue, for instance, normal prostate 
tissue. The gene expression profile may also be compared to a gene expression profile prepared 
from BPH or malignant prostate cells, or from tissue or cells from the same patient before 
treatment. The gene expression profile may be made from at least one gene, preferably more 
than one gene, and most preferably all or nearly all of the genes in Tables 1-5. 

Use of the BPH Markers for Drug Screening 

[0075] According to the present invention, the genes identified in Tables 1-5 can be used as 
markers to screen for potential therapeutic agents or compounds to treat BPH or prostate cancer. 
A candidate drug or agent can be screened for the ability to stimulate the transcription or 
expression of a given marker or to down-regulate or counteract the transcription or expression of 
a marker or markers. Compounds that modulate the expression level of a single gene and also 
compounds that modulate the expression level of multiple genes from levels associated with a 
specific disease state to a normal state can be screened by using the markers and profiles 
identified herein. 

[0076] According to the present invention, one can also compare the specificity of a drug's 
effects by looking at the number of markers which are differentially expressed after drug 
exposure and comparing them. More specific drugs will have fewer transcriptional targets. 
Similar sets of markers identified for two drugs may indicate a similarity of effects. 
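
As an illustrative sketch only, the comparison of two drugs described in paragraph [0076] might be carried out as follows. The 2-fold cutoff used to call a marker differentially expressed, the use of a Jaccard index to express the similarity of two marker sets, and all names and values are assumptions made for this sketch.

```python
def responsive_markers(control, treated, min_fold=2.0):
    """Return the markers whose level changes by at least min_fold.

    control, treated: dict mapping marker name -> positive expression level.
    A change in either direction (up or down) counts.
    """
    hits = set()
    for gene, base in control.items():
        level = treated[gene]
        fold = max(level, base) / min(level, base)
        if fold >= min_fold:
            hits.add(gene)
    return hits


def jaccard(a, b):
    """Overlap of two marker sets: 0.0 (disjoint) to 1.0 (identical)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


# Hypothetical expression levels for four markers.
control = {"g1": 100, "g2": 100, "g3": 100, "g4": 100}
drug_a = {"g1": 300, "g2": 100, "g3": 110, "g4": 95}
drug_b = {"g1": 250, "g2": 40, "g3": 400, "g4": 100}

targets_a = responsive_markers(control, drug_a)
targets_b = responsive_markers(control, drug_b)
similarity = jaccard(targets_a, targets_b)
```

Here the first drug alters fewer markers than the second and would therefore be regarded as the more specific of the two, while the overlap value gives a rough measure of the similarity of their effects.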

[0077] Assays to monitor the expression of a marker or markers as defined in Tables 1-5 may 
utilize any available means of monitoring for changes in the expression level of the nucleic acids 
of the invention. As used herein, an agent is said to modulate the expression of a nucleic acid of 
the invention if it is capable of up- or down-regulating expression of the nucleic acid in a cell. 

[0078] In one assay format, gene chips containing probes to at least 2 genes from Tables 1-5 
may be used to directly monitor or detect changes in gene expression in the treated or exposed 
cell as described in more detail above. In another format, the changes of mRNA expression level 
can be detected using QuantiGene technology (Warrior et al. (2000) J. Biomolecular Screening, 
5, 343-351). Specific probes used for QuantiGene can be designed and synthesized to one or 
more genes from Tables 1-5. Cells treated with compounds are lysed by lysis buffer. The 
amount of target mRNA can be detected as a luminescence intensity using target specific probes. 

[0079] In another format, cell lines that contain reporter gene fusions between the open reading 
frame and/or 5'/3' regulatory regions of a gene in Tables 1-5 and any assayable fusion partner 
may be prepared. Numerous assayable fusion partners are known and readily available including 
the firefly luciferase gene and the gene encoding chloramphenicol acetyltransferase (Alam et al. 
(1990) Anal. Biochem. 188:245-254). Cell lines containing the reporter gene fusions are then 
exposed to the agent to be tested under appropriate conditions and time. Differential expression 
of the reporter gene between samples exposed to the agent and control samples identifies agents 
which modulate the expression of the nucleic acid. 

[0080] Additional assay formats may be used to monitor the ability of the agent to modulate the 
expression of a gene identified in Tables 1-5. For instance, as described above, mRNA 
expression may be monitored directly by hybridization of probes to the nucleic acids of the 
invention. Cell lines are exposed to the agent to be tested under appropriate conditions and time 
and total RNA or mRNA is isolated by standard procedures such as those disclosed in Sambrook et 
al. (Molecular Cloning: A Laboratory Manual, 2nd Ed. Cold Spring Harbor Laboratory Press, 
1989). 

[0081] In another assay format, cells or cell lines are first identified which express the gene 
products of the invention physiologically (see below). Cell and/or cell lines so identified would 
be expected to comprise the necessary cellular machinery such that the fidelity of modulation of 
the transcriptional apparatus is maintained with regard to exogenous contact of agent with 
appropriate surface transduction mechanisms and/or the cytosolic cascades. Such cell lines may 
be, but are not required to be, prostate derived. Further, such cells or cell lines may be 
transduced or transfected with an expression vehicle (e.g., a plasmid or viral vector) construct 
comprising an operable non-translated 5'-promoter containing end of the structural gene encoding 
the instant gene products fused to one or more antigenic fragments, which are peculiar to the 
instant gene products, wherein said fragments are under the transcriptional control of said 
promoter and are expressed as polypeptides whose molecular weight can be distinguished from 
the naturally occurring polypeptides or may further comprise an immunologically distinct tag or 
some other detectable marker or tag. Such a process is well known in the art (see Maniatis). 

[0082] Cells or cell lines transduced or transfected as outlined above are then contacted with 
agents under appropriate conditions; for example, the agent comprises a pharmaceutically 
acceptable excipient and is contacted with cells comprised in an aqueous physiological buffer 
such as phosphate buffered saline (PBS) at physiological pH, Eagle's balanced salt solution (BSS) 
at physiological pH, PBS or BSS comprising serum or conditioned media comprising PBS or 
BSS and/or serum incubated at 37°C. Said conditions may be modulated as deemed necessary by 
one of skill in the art. Subsequent to contacting the cells with the agent, said cells are disrupted 
and the polypeptides of the lysate are fractionated such that a polypeptide fraction is pooled and 
contacted with an antibody to be further processed by immunological assay (e.g., ELISA, 
immunoprecipitation or Western blot). The pool of proteins isolated from the "agent-contacted" 
sample is then compared with a control sample where only the excipient is contacted with the 
cells and an increase or decrease in the immunologically generated signal from the 
"agent-contacted" sample compared to the control is used to distinguish the effectiveness of the agent. 

[0083] Another embodiment of the present invention provides methods for identifying agents 
that modulate at least one activity of a protein(s) encoded by the genes in Tables 1-5. Such 
methods or assays may utilize any means of monitoring or detecting the desired activity. 

[0084] In one format, the relative amounts of a protein of the invention between a cell 
population that has been exposed to the agent to be tested compared to an un-exposed control cell 
population may be assayed. In this format, probes such as specific antibodies are used to monitor 
the differential expression of the protein in the different cell populations. Cell lines or 
populations are exposed to the agent to be tested under appropriate conditions and time. Cellular 
lysates may be prepared from the exposed cell line or population and a control, unexposed cell 
line or population. The cellular lysates are then analyzed with the probe, such as a specific 
antibody. 

[0085] Agents that are assayed in the above methods can be randomly selected or rationally 
selected or designed. As used herein, an agent is said to be randomly selected when the agent is 
chosen randomly without considering the specific sequences involved in the association of a 
protein of the invention alone or with its associated substrates, binding partners, etc. An example 
of randomly selected agents is the use of a chemical library or a peptide combinatorial library, or a 
growth broth of an organism. 

[0086] As used herein, an agent is said to be rationally selected or designed when the agent is 
chosen on a nonrandom basis which takes into account the sequence of the target site and/or its 
conformation in connection with the agent's action. Agents can be rationally selected or 
rationally designed by utilizing the peptide sequences that make up these sites. For example, a 
rationally selected peptide agent can be a peptide whose amino acid sequence is identical to or a 
derivative of any functional consensus site. 

[0087] The agents of the present invention can be, as examples, peptides, small molecules, 
vitamin derivatives, as well as carbohydrates. Dominant negative proteins, DNAs encoding these 
proteins, antibodies to these proteins, peptide fragments of these proteins or mimics of these 
proteins may be introduced into cells to affect function. "Mimic" used herein refers to the 
modification of a region or several regions of a peptide molecule to provide a structure 
chemically different from the parent peptide but topographically and functionally similar to the 
parent peptide (see Grant GA. in: Meyers (ed.) Molecular Biology and Biotechnology (New 
York, VCH Publishers, 1995), pp. 659-664). A skilled artisan can readily recognize that there is 
no limit as to the structural nature of the agents of the present invention. 

Cells used for Multi Gene Screening 

[0088] Many kinds of cells such as primary cells and cell lines can be used for the drug 
screening methods of the invention. Cells or cell lines derived from prostatic tissues are 
preferred because the innate gene expression mechanisms of these cells often resemble those of 
prostatic tissues. Cells used for drug screening can be selected by assaying for the expression of 
one or more of the marker genes listed in Tables 1-5. The cells which differentially express one 
or more, or preferably nearly all of the marker genes listed in Tables 1-5 are preferred cells or 
cell lines for the methods of the invention (see Table 6). 

Kits 

[0089] The invention further includes kits combining, in different combinations, high-density 
oligonucleotide arrays, reagents for use with the arrays, signal detection and array-processing 
instruments, gene expression databases and analysis and database management software 
described above. The kits may be used, for example, to diagnose the disease state of a tissue or 
cell sample, to monitor the progression of prostate disease states, to identify genes that show 
promise as new drug targets and to screen known and newly designed drugs as discussed above. 

[0090] The databases packaged with the kits are a compilation of expression patterns from 
human and laboratory animal genes and gene fragments (corresponding to the genes of Tables 
1-5). In particular, the database software and packaged information include the expression results 
of Tables 1-5 that can be used in the assays and methods as herein described. 

[0091] The kits may be used in the pharmaceutical industry, where the need for early drug testing 
is strong due to the high costs associated with drug development, but where bioinformatics, in 
particular gene expression informatics, is still lacking. These kits will reduce the costs, time and 
risks associated with traditional new drug screening using cell cultures and laboratory animals. 
The results of large-scale drug screening of pre-grouped patient populations, pharmacogenomics 
testing, can also be applied to select drugs with greater efficacy and fewer side-effects. The kits 
may also be used by smaller biotechnology companies and research institutes who do not have 
the facilities for performing such large-scale testing themselves. 

[0092] Databases and software designed for use with microarrays are discussed in 
Balaban et al., U.S. Patent Nos. 6,229,911, a computer-implemented method for managing 
information, stored as indexed tables, collected from small or large numbers of microarrays, and 
6,185,561, a computer-based method with data mining capability for collecting gene expression 
level data, adding additional attributes and reformatting the data to produce answers to various 
queries. Chee et al., U.S. Patent No. 5,974,164, disclose a software-based method for identifying 
mutations in a nucleic acid sequence based on differences in probe fluorescence intensities 
between wild type and mutant sequences that hybridize to reference sequences. 

[0093] Without further description, it is believed that one of ordinary skill in the art can, using 
the preceding description and the following illustrative examples, make and utilize the genes, 
chips, etc. of the present invention and practice the claimed methods. The following working 
examples therefore, specifically point out the preferred embodiments of the present invention, 
and are not to be construed as limiting in any way the remainder of the disclosure. 

EXAMPLES 

Example 1: Gene chip expression analysis 

[0094] BPH, normal prostate tissue, and prostate tissue adjacent to malignant prostate tissue 
were obtained from human biopsy samples. 

[0095] Microarray sample preparation was conducted with minor modifications, following the 
protocols set forth in the Affymetrix GeneChip Expression Analysis Manual. Frozen tissue was 
ground to a powder using a Spex Certiprep 6800 Freezer Mill. Total RNA was extracted with 
Trizol (GibcoBRL) utilizing the manufacturer's protocol. The total RNA yield for each sample 
was 200-500 µg per 300 mg tissue weight. mRNA was isolated using the Oligotex mRNA Midi 
kit (Qiagen) followed by ethanol precipitation. Double stranded cDNA was generated from 
mRNA using the Superscript Choice system (GibcoBRL). First strand cDNA synthesis was 
primed with a T7-(dT24) oligonucleotide. The cDNA was phenol-chloroform extracted and 
ethanol precipitated to a final concentration of 1 µg/ml. From 2 µg of cDNA, cRNA was 
synthesized using Ambion's T7 MegaScript in vitro Transcription Kit. 

[0096] To biotin label the cRNA, nucleotides Bio-11-CTP and Bio-16-UTP (Enzo Diagnostics) 
were added to the reaction. Following a 37°C incubation for six hours, impurities were removed 
from the labeled cRNA following the RNeasy Mini kit protocol (Qiagen). cRNA was 
fragmented (fragmentation buffer consisting of 200 mM Tris-acetate, pH 8.1, 500 mM KOAc, 
150 mM MgOAc) for thirty-five minutes at 94°C. Following the Affymetrix protocol, 55 µg of 
fragmented cRNA was hybridized on the Affymetrix Human 42K array set for twenty-four hours 
at 60 rpm in a 45°C hybridization oven. The chips were washed and stained with Streptavidin 
Phycoerythrin (SAPE) (Molecular Probes) in Affymetrix fluidics stations. To amplify staining, 
SAPE solution was added twice with an anti-streptavidin biotinylated antibody (Vector 
Laboratories) staining step in between. Hybridization to the probe arrays was detected by 
fluorometric scanning (Hewlett Packard Gene Array Scanner). Data was analyzed using 
Affymetrix GeneChip version 3.0 and Expression Data Mining Tool (EDMT) software (version 
1.0). 

[0097] Differential expression of genes between the BPH and normal prostate samples was 
determined using the Affymetrix GeneChip human gene chip set by the following criteria: 1) For 
each gene, Affymetrix GeneChip average difference values were determined by standard 
Affymetrix EDMT software algorithms, which also made "Absent" (=not specifically detected as 
gene expression), "Present" (=detected) or "Marginal" (=not clearly Absent or Present) calls for 
each GeneChip element; 2) all AveDiff values which were less than +20 (positive 20) were 
raised to a floor of +20 so that fold change calculations could be made where values were not 
already greater than or equal to +20; 3) median levels of expression were compared between the 
normal control group and the BPH with symptoms disease group to obtain greater than or equal to 
2-fold up/down values; 4) the median value for the higher expressing group needed to be greater than 
or equal to 200 average difference units in order to be considered for statistical significance; 5) 
genes passing criteria 1-4 were analyzed for statistical significance using a two-tailed t 
test and deemed statistically significant if p < 0.05. Tables 1 and 2 list the genes and their levels 
of differential expression (compared to normal samples) in BPH tissue from patients with 
symptoms of BPH and in BPH tissue immediately adjacent to malignant prostate tissue isolated 
from male patients. 
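
For purposes of illustration only, criteria 2 through 4 above might be applied to the average difference values of a single gene as sketched below. The function names and the sample values are hypothetical. The signed fold-change convention (positive when up in BPH, negative when down) is an assumption of this sketch. The two-tailed t test of criterion 5 is not shown and would be applied afterwards, for instance with a statistics package, to the genes that pass.

```python
from statistics import median

FLOOR = 20.0        # criterion 2: values below +20 are raised to +20
MIN_FOLD = 2.0      # criterion 3: at least 2-fold up or down
MIN_MEDIAN = 200.0  # criterion 4: higher-expressing group median >= 200


def floor_values(values, floor=FLOOR):
    """Raise every value below the floor up to the floor."""
    return [max(v, floor) for v in values]


def fold_change_if_selected(normal, bph):
    """Apply criteria 2-4 to one gene.

    normal, bph: lists of average difference values for the two groups.
    Returns a signed fold change (positive = up in BPH, negative = down
    in BPH) when the gene passes, otherwise None.
    """
    n_med = median(floor_values(normal))
    b_med = median(floor_values(bph))
    if max(n_med, b_med) < MIN_MEDIAN:
        return None
    if b_med >= n_med:
        fold = b_med / n_med
    else:
        fold = -(n_med / b_med)
    return fold if abs(fold) >= MIN_FOLD else None


# Hypothetical values for four genes.
up_gene = fold_change_if_selected([10, 50, 100], [400, 500, 600])
down_gene = fold_change_if_selected([300, 320, 310], [100, 150, 120])
too_weak = fold_change_if_selected([50, 60, 70], [150, 160, 170])
too_small = fold_change_if_selected([300, 300, 300], [400, 400, 400])
```

The floor keeps the denominator of the fold change away from zero or negative average difference values, which is the stated purpose of criterion 2.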

Example 2 : Expression profile analysis 

[0098] Gene expression profiles between normal sample and BPH patient samples were 
determined by using the following samples: 10 normal; 7 BPH without symptoms; 8 BPH with 
cancer; and 8 BPH with symptoms. Gene expression profiles were prepared using the 42K 
Affymetrix Gene Chip set. The methods used were the same as described in Example 1 with the 
exception of the criteria to select the marker genes. 

[0099] The criteria used in this study were as follows: 1) For each gene, Affymetrix GeneChip 
average difference values were determined by standard Affymetrix EDMT software algorithms, 
which also made "Absent" (=not specifically detected as gene expression), "Present" (=detected) 
or "Marginal" (=not clearly Absent or Present) calls for each GeneChip element; 2) all AveDiff 
values which were less than +20 (positive 20) were raised to a floor of +20 so that fold change 
calculations could be made where values were not already greater than or equal to +20; 3) mean 
levels of expression were compared between the normal control group and the BPH with 
symptoms disease group; 4) genes were arranged by the fold change starting with the largest one 
(Fold change calculation was determined by using logarithmic values in Example 2); and 5) the 
top 200 up-regulated genes and bottom 200 down-regulated genes were selected. The genes 
identified in this study are listed in Tables 3 (normal vs. BPH with symptoms, up regulated) and 
4 (normal vs. BPH with symptoms, down regulated, values are negative fold-change from 
normal). 
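
The ranking procedure of this example might be sketched as follows, purely by way of illustration. The text states only that fold change was determined using logarithmic values; taking the base-2 logarithm of the ratio of group means is an assumption of this sketch, as are the function name, the data layout and the sample values.

```python
from math import log2
from statistics import mean


def rank_genes(normal, bph, top_n=200, floor=20.0):
    """Select the most up- and down-regulated genes.

    normal, bph: dict mapping gene name -> list of average difference
    values. Values below the floor are raised to the floor, group means
    are compared, and genes are ranked by log2(mean BPH / mean normal),
    which is an assumed form of the logarithmic fold change.
    Returns (up, down): up to top_n gene names each, strongest first.
    """
    scores = {}
    for gene in normal:
        n = mean(max(v, floor) for v in normal[gene])
        b = mean(max(v, floor) for v in bph[gene])
        scores[gene] = log2(b / n)
    ranked = sorted(scores, key=scores.get, reverse=True)
    up = [g for g in ranked[:top_n] if scores[g] > 0]
    down = [g for g in ranked[::-1][:top_n] if scores[g] < 0]
    return up, down


# Hypothetical data for four genes, two samples per group.
normal = {"a": [100, 100], "b": [400, 400], "c": [100, 100], "d": [10, 10]}
bph = {"a": [800, 800], "b": [100, 100], "c": [200, 200], "d": [5, 15]}
up, down = rank_genes(normal, bph, top_n=2)
```

Gene "d" in this hypothetical set lies entirely below the floor in both groups, so its fold change is zero and it appears in neither list.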

Example 3 : Selection of Cell lines used for Multi Gene Screening 

[0100] A number of cultured cell lines were tested to determine the similarity in gene 
expression profiles to BPH tissue. Cells were cultured in 6-well plates using the appropriate 
medium for each cell line. After reaching 90% confluency, cells were lysed with Trizol 
(GibcoBRL) and total RNA was extracted. mRNA was then isolated, cDNA and cRNA were 
synthesized, and gene expression levels were determined by the Affymetrix Human 42K Gene 
Chip set as described in more detail above. 

[0101] The gene expression profiles were compared with those of prostatic tissue samples. A 
panel of 61 genes whose expression levels were up-regulated in BPH with symptoms compared 
with normal samples and with small variation among samples (within BPH samples and within 
normal samples) were assayed. The number of genes whose signal intensity was more than 100 
in each cell line is summarized in Table 6. A panel of 43 genes whose expression levels were 
down-regulated in BPH patients with small variation among samples was also assayed. The 
number of genes whose signal in the Affymetrix Gene Chip received a "Present" call is also 
included in Table 6. 

[0102] Between 48% and 58% of the genes used in this analysis were expressed in the cell lines of 
Table 6. These results indicate that the cell lines BRF-55T (Biological Research Faculty & Facility 
Inc.), PZ-HPV7 (ATCC; CRL-2221), BPH-1 (S.W. Hayward et al., In Vitro Cell Dev. Biol. 31A, 
14-24, 1995) and LNCaP (ATCC; CRL-1740) can be used as a BPH-like cell population to 
screen for compounds which are capable of modulating gene expression profiles from the disease 
state to a normal state. In particular, BRF-55T is a useful cell line for screening in the assays of 
the invention, because 58% of the assayed genes were differentially expressed in BRF-55T 
as compared to BPH with symptoms tissue. 

Example 4 : Cluster analysis of up- or down-regulated genes in BPH 

[0103] Cluster analysis of the expression results from a large number of genes is often 
problematic due to variations in the standardization of the gene expression data. To compensate 
for these variations, a subset of differentially expressed genes was selected by a modified 
analysis procedure. 

[0104] In a first step, a gene list comparing normal vs. disease samples was generated by two 
kinds of comparisons. First, genes were selected that displayed a greater than or equal to mean 
2-fold up or down regulation using average difference expression values and with p<0.05. 
Second, genes were selected by ANOVA comparing the normal group of samples with the 
disease group and with a t value of >3 in the up or down direction. These lists were then 
combined to create an expression profile characteristic of normal controls and one characteristic 
of disease in which specific genes are found to be up or down regulated in disease when 
compared with normal controls. 

[0105] In preparation for clustering analysis to identify subgroups of genes that show 
statistically similar expression patterns, average difference values for the selected genes were 
normalized across all samples (normal and disease combined) using the following formula: 
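Normalized data = (X - Xmean)/Sx 

where X is the average difference value of a gene in a given sample, Xmean is the mean of that gene's values across all samples, and Sx is the standard deviation (STD) of those values. 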

[0106] This converts the mean expression value for each gene to 0 and rescales each gene's values so that 
one standard deviation above or below the mean corresponds to 1 or -1, respectively. Thus, genes with high absolute expression values when compared with 
genes with low absolute expression values would not skew the comparisons when clustering 
algorithms are applied. 
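
As a minimal sketch, the standardization described above might be computed for one gene as follows. The use of the sample standard deviation (divisor n-1) is an assumption, since the text does not say which form was used, and the expression values shown are hypothetical.

```python
from statistics import mean, stdev


def standardize(values):
    """Return (x - mean) / standard deviation for each value of one gene.

    values: expression values for one gene across all samples (normal
    and disease combined). The sample standard deviation (n-1) is an
    assumption of this sketch.
    """
    m = mean(values)
    s = stdev(values)
    return [(x - m) / s for x in values]


# Two hypothetical genes with the same pattern but very different
# absolute expression levels.
low_abundance = standardize([100, 200, 300])
high_abundance = standardize([1000, 2000, 3000])
```

Both hypothetical genes yield the same standardized profile, which illustrates why a highly expressed gene does not dominate the subsequent clustering.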

[0107] The measurement of the cluster space distance was determined by using the correlation 
coefficient (1-r) method and clustering was performed using Ward's method (Ward, J.H. (1963) 
Journal of the American Statistical Association, 58, 236). 
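
The correlation coefficient (1-r) distance referred to above might be computed as in the following sketch, in which r is taken to be the Pearson correlation between two expression profiles. The function names, element names and profile values are hypothetical, and the subsequent Ward linkage step is not shown.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def correlation_distance(x, y):
    """Distance 1 - r: 0 for identical patterns, 2 for opposite ones."""
    return 1.0 - pearson_r(x, y)


def distance_matrix(profiles):
    """Pairwise 1 - r distances for a dict of name -> profile."""
    names = sorted(profiles)
    return {(a, b): correlation_distance(profiles[a], profiles[b])
            for a in names for b in names}


# Hypothetical profiles: two elements for the same gene and one
# element with the opposite pattern.
profiles = {
    "element_1": [1.0, 2.0, 3.0],
    "element_2": [2.0, 4.0, 6.0],
    "opposite": [3.0, 2.0, 1.0],
}
dist = distance_matrix(profiles)
```

Under this measure two elements that report the same gene are expected to lie close together regardless of their absolute signal, which is the property relied upon in the validation described below.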

[0108] The clustering was validated by observing whether multiple elements representing the 
same genes showing the same direction of expression change (i.e., either up or down) tend to 
cluster together. To test this standardization and clustering protocol, the expression levels for 
genes that are represented by more than one element on the 42K gene chip set were analyzed to 
determine whether the multiple elements for a single gene could be clustered together. For 
example, tryptase, also known as alpha tryptase or beta (tryptase II) is represented by two 
separate elements on the 42K human gene chip. This gene is registered with 2 different element 
names 41268 (5), M33493_s_at (code name, Up-170) and 26389 (3), rc_AA131322_s_at (code 
name, Up-010). 

[0109] It was found that the best analysis means for decreasing measurement errors between 
these two elements is by the Ward method as it gave the most consistent results when compared 
to other clustering methods. These analysis methods may be incorporated into software or 
computer readable storage media for storing a computer program or software. 

Example 5 : Selection of 60 Marker Genes 

[0110] A panel of 60 representative marker genes (listed in Table 5) out of 400 marker genes 
listed in Tables 3 and 4 can be used in the assays and methods of the invention. The 60 marker 
genes were selected based on the following criteria: (1) expression level is changed greatly in BPH 
patient samples compared with normal samples; (2) variation of expression levels within BPH 
samples and within normal samples is small; and (3) expression levels resembling BPH with 
symptoms are detected in cell line BRF-55T. 

Example 6: Gene Expression Analysis of Select Genes 

[0111] The expression levels of three genes from Tables 1-5 (the genes encoding cellular 
retinol binding protein, S100 calcium binding protein and PSMA) were assayed in various tissues 
and prostate samples by PCR as described in Example 7 (see Figures 1-6). Each sample was 
assayed for the level of GAPDH and mRNA corresponding to cellular retinol binding protein, 
S100 calcium binding protein or PSMA. As seen in Figures 1-6, these three genes are 
differentially regulated or expressed in BPH tissue from patients with or without symptoms and 
from BPH tissue from patients with prostate cancer (compared to normal prostate tissue). All 
three genes are therefore useful markers in the assays of the invention, such as the assays to 
measure the effect of an agent on BPH or the assays to detect or diagnose the occurrence or 
progression of BPH. 



Example 7: Drug Screening Assays 

[0112] The expression profiles for normal controls and disease samples described above can be 
used to identify compound hits from a compound library. A hit may be defined as one of three 
kinds of results: 

[0113] 1) The expression of an individual gene is changed in the direction of normal (i.e., if up 
in disease, then down=hit; if down in disease, then up=hit). The stronger the modulation of an 
individual gene to a normal phenotype, the stronger the hit status for the compound against that 
gene. 

[0114] 2) The expression of genes that subcluster together is evaluated for an overall pattern of 
modulation to a normal expression profile. The more genes in a subcluster that are modulated to 
a normal phenotype, the stronger the hit status for the compound against that subcluster. A 
subcluster may represent common or interacting cellular pathways. 

[0115] 3) The overall expression profile of all of the genes being screened is evaluated for 
modulation to normal. The more genes that are modulated to a normal phenotype, the stronger 
the hit status for the compound against the entire gene set. 
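
The three kinds of hit can be sketched as three levels of scoring. The sketch below is illustrative only: the per-gene score (the fraction of the disease-to-normal gap closed by the compound), the 0.5 threshold, and all gene names, subcluster names and values are assumptions introduced for the example.

```python
def normalization_score(disease, normal, treated):
    """Fraction of the disease-to-normal gap closed by treatment.

    1.0 means fully restored to normal; negative means moved further
    toward the disease phenotype.
    """
    gap = normal - disease
    if gap == 0:
        return 0.0
    return (treated - disease) / gap


def score_compound(disease, normal, treated, subclusters, threshold=0.5):
    """Return per-gene scores, per-subcluster hit fractions and the overall fraction."""
    gene_scores = {g: normalization_score(disease[g], normal[g], treated[g])
                   for g in disease}
    hits = {g for g, s in gene_scores.items() if s >= threshold}
    subcluster_scores = {name: sum(g in hits for g in genes) / len(genes)
                         for name, genes in subclusters.items()}
    overall = len(hits) / len(gene_scores)
    return gene_scores, subcluster_scores, overall


# Invented expression levels for four hypothetical genes.
disease = {"g1": 8.0, "g2": 6.0, "g3": 1.0, "g4": 5.0}
normal = {"g1": 2.0, "g2": 2.0, "g3": 4.0, "g4": 1.0}
treated = {"g1": 3.0, "g2": 5.5, "g3": 3.5, "g4": 6.0}
subclusters = {"cluster_1": ["g1", "g3"], "cluster_2": ["g2", "g4"]}

gene_scores, subcluster_scores, overall = score_compound(
    disease, normal, treated, subclusters)
```

Here g1 (up in disease, lowered by the compound) and g3 (down in disease, raised by the compound) are individual hits, cluster_1 is fully modulated toward normal, and half of the screened genes are modulated overall. A negative score, as for g4, marks a shift toward the disease phenotype.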

[0116] As described above, if a compound modulates the gene expression pattern of the 
screening system cells more towards any disease phenotype, then it can be used as a molecular 
probe to find binding proteins and/or define disease-associated cellular pathways. 

[0117] As an example, candidate agents and compounds are screened for their ability to 
modulate the expression levels of cellular retinol binding protein, S100 calcium binding protein 
and PSMA by exposing a prostate cell line or cell line from BPH tissue to the agent and assaying 
the expression levels of these genes by real time PCR. Real time PCR detection is accomplished 
by the use of the ABI PRISM 7700 Sequence Detection System. The 7700 measures the 
fluorescence intensity of the sample each cycle and is able to detect the presence of specific 
amplicons within the PCR reaction. Each sample is assayed for the level of GAPDH and mRNA 
corresponding to cellular retinol binding protein, S100 calcium binding protein and PSMA. 
GAPDH detection is performed using Perkin Elmer part #402869 according to the manufacturer's 
directions. Primers were designed for the three genes by using Primer Express, a program 
developed by PE to efficiently find primers and probes for specific sequences: 

(1) N91971 - FAM PROBE; Forward: 5'-CAT ggC TTT gTT TTA AgA AAA ggA A-3'; Reverse: 5'-AgC 
CAC CCC CAg gCA T-3'; Probe: 5'-FAM-AgT gAC AAA gCC AAg AgA CAg ACT CTg CTA ACA-TAMRA-3' 
(2) X65614 - SYBR; Forward: 5'-AAA gAC AAg gAT gCC gTg gAT-3'; Reverse: 5'-AgC CAC gAA CAC 
gAT gAA CTC-3' 
(3) M99487 - SYBR; Forward: 5'-Tgg CTC AgC ACC ACC AgA T-3'; Reverse: 5'-TTC CAg TAA AgC 
CAg gTC CAA-3' 

[0118] These primers are used in conjunction with SYBR green (Molecular Probes), a 
nonspecific double stranded DNA dye, to measure the expression level of mRNA corresponding to 
the genes, which is normalized to the GAPDH level in each sample. 

[0119] Normalized expression levels from cells exposed to the agent are then compared to the 
normalized expression levels in control cells. Agents that modulate the expression of one or 
more of the genes may be further tested as drug candidates in appropriate BPH in vitro or in vivo 
models. 
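
One common way to carry out this comparison is the comparative threshold cycle (delta-delta Ct) calculation, sketched below. The text above does not state which calculation was used, so the formula, the assumption of roughly 100% amplification efficiency (a doubling per cycle), the function name and the Ct values are all illustrative assumptions.

```python
def relative_expression(ct_target_treated, ct_gapdh_treated,
                        ct_target_control, ct_gapdh_control):
    """Fold change of a target gene in treated versus control cells.

    Each sample is first normalized to GAPDH (delta Ct), then the treated
    sample is compared with the control (delta-delta Ct). Assumes the
    product doubles every cycle.
    """
    delta_treated = ct_target_treated - ct_gapdh_treated
    delta_control = ct_target_control - ct_gapdh_control
    return 2.0 ** -(delta_treated - delta_control)


# Invented Ct values: the target crosses threshold two cycles later in
# treated cells, while GAPDH is unchanged.
fold_change = relative_expression(
    ct_target_treated=26.0, ct_gapdh_treated=18.0,
    ct_target_control=24.0, ct_gapdh_control=18.0)
```

A fold change below 1 indicates that the agent lowered expression of the gene relative to control cells, and a value above 1 indicates that it raised expression.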

Example 8: Diagnostic Assays 

[0120] The expression profiles or one or more of the individual genes of Tables 1-5 are used as 
molecular or diagnostic markers to evaluate the disease status of a patient sample. In one 
embodiment, a patient prostate tissue sample is processed as described herein to produce total 
cellular RNA or mRNA. The RNA is hybridized to a chip containing probes that specifically hybridize 
to one or more, or two or more, of the genes in Tables 1-5. The overall expression profile 
generated, or the expression levels of individual genes, are then compared to the profiles as 
described in Tables 1-5 to determine the disease or hyperplastic state of the patient sample. 
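
The comparison of a patient profile with reference profiles can be sketched as a nearest-profile assignment. This is only one possible approach: the use of Pearson correlation as the similarity measure, the function names, the gene names and all values are assumptions made for illustration and are not taken from Tables 1-5.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def classify(sample, references, genes):
    """Return the best-matching reference label and all similarity scores."""
    sample_vec = [sample[g] for g in genes]
    scores = {label: pearson(sample_vec, [ref[g] for g in genes])
              for label, ref in references.items()}
    best = max(scores, key=scores.get)
    return best, scores


# Invented expression levels for four hypothetical marker genes.
genes = ["g1", "g2", "g3", "g4"]
references = {
    "normal": {"g1": 2.0, "g2": 2.0, "g3": 4.0, "g4": 1.0},
    "BPH": {"g1": 8.0, "g2": 6.0, "g3": 1.0, "g4": 5.0},
}
patient = {"g1": 7.5, "g2": 5.0, "g3": 1.5, "g4": 4.0}

label, scores = classify(patient, references, genes)
```

With these invented values the patient profile correlates positively with the BPH reference and negatively with the normal reference, so the sample is assigned the BPH label.
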

[0121] Although the present invention has been described in detail with reference to examples 
above, it is understood that various modifications can be made without departing from the spirit 
of the invention. Accordingly, the invention is limited only by the following claims. All cited 
patents, applications, GenBank Accession numbers and publications referred to in this 
application are herein incorporated by reference in their entirety. 